A research team led by LI Xuefei at the Shenzhen Institutes of Advanced Technology (SIAT) of the Chinese Academy of Sciences, in collaboration with TIAN Liang’s team from Hong Kong Baptist University, has developed a deconvolution algorithm called DeSide. Built on deep learning and publicly available scRNA-seq datasets, the algorithm can accurately estimate the abundance of 16 cell types across 19 types of solid tumors. The study was published in PNAS.
Solid tumors consist of various cell types including cancer cells, endothelial cells, immune cells, and fibroblasts. Previous studies have shown that the proportions of these cell types can strongly correlate with cancer progression, highlighting the importance of accurately assessing tumor cellular composition for clinical insights and therapy.
Current methods for investigating tumor cellular composition, such as flow cytometry and single-cell RNA sequencing (scRNA-seq), are typically costly and may suffer from low extraction efficiency. A series of studies have therefore aimed to estimate cellular composition from bulk RNA-seq samples instead, but these methods often perform inconsistently across different tumor types.
To address this challenge, the researchers integrated 12 single-cell RNA-seq datasets from six solid tumor types, creating a comprehensive reference for synthesizing virtual bulk RNA-seq data (the training set). They introduced an innovative sampling method to generate virtual bulk tumors with a wider range of cell proportion combinations. To improve the quality of the synthesized data, they filtered both the genes and the synthesized gene expression profiles, reducing the complexity of the input data and increasing its similarity to real tumor expression patterns.
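The general idea of synthesizing a virtual bulk sample from single-cell references can be sketched as follows. This is a minimal illustration, not DeSide's actual procedure: the function names, the toy expression values, and the uniform sampling of proportions on the simplex are all assumptions made for the example, and the study's own sampling method and filtering steps are not reproduced here.

```python
import random


def sample_proportions(cell_types, rng):
    """Draw one proportion vector that sums to 1.

    Normalizing independent exponential draws gives a uniform sample on the
    simplex. This is a generic choice for illustration only; it is NOT the
    sampling method introduced in the study.
    """
    draws = [rng.expovariate(1.0) for _ in cell_types]
    total = sum(draws)
    return {ct: d / total for ct, d in zip(cell_types, draws)}


def synthesize_bulk(profiles, proportions):
    """Mix per-cell-type expression profiles into one virtual bulk profile.

    Each gene's bulk value is the proportion-weighted sum of that gene's
    value in every cell type.
    """
    n_genes = len(next(iter(profiles.values())))
    bulk = [0.0] * n_genes
    for cell_type, fraction in proportions.items():
        for g, value in enumerate(profiles[cell_type]):
            bulk[g] += fraction * value
    return bulk


rng = random.Random(0)

# Toy, made-up expression values for 3 genes (hypothetical, not real data).
profiles = {
    "cancer_cell": [10.0, 0.0, 2.0],
    "t_cell": [0.0, 8.0, 1.0],
    "fibroblast": [1.0, 1.0, 9.0],
}

proportions = sample_proportions(list(profiles), rng)  # training label
bulk = synthesize_bulk(profiles, proportions)          # training input
```

Each (bulk, proportions) pair then serves as one training example: the mixed profile is the model input and the known proportions are the target.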
DeSide introduces a novel deep neural network architecture that uses two fully connected networks to extract information from both biological signaling pathways and gene expression profiles. Moreover, DeSide refines the network’s activation function, effectively reducing prediction errors for tumor cell proportions.
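A two-branch design of this kind can be illustrated with a small forward pass. The sketch below is hypothetical throughout: the layer sizes, the random untrained weights, the concatenation of the two branches, and the ReLU and sigmoid activations are assumptions chosen for the example, since the article does not specify how DeSide combines its branches or which activation function it uses.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def random_layer(n_out, n_in, rng):
    """Untrained layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(gene_expr, pathway_scores, params):
    """Run both branches, concatenate their outputs, and predict fractions.

    Concatenation and the sigmoid output are illustrative assumptions,
    not details confirmed by the article.
    """
    gene_hidden = relu(dense(gene_expr, *params["gene"]))
    pathway_hidden = relu(dense(pathway_scores, *params["pathway"]))
    merged = gene_hidden + pathway_hidden  # list concatenation
    return sigmoid(dense(merged, *params["head"]))


rng = random.Random(0)
n_genes, n_pathways, n_hidden, n_cell_types = 5, 2, 4, 3  # toy sizes

params = {
    "gene": random_layer(n_hidden, n_genes, rng),
    "pathway": random_layer(n_hidden, n_pathways, rng),
    "head": random_layer(n_cell_types, 2 * n_hidden, rng),
}

gene_expr = [1.2, 0.0, 3.4, 0.7, 2.1]  # made-up expression values
pathway_scores = [0.3, -0.8]           # made-up pathway activity scores
fractions = predict(gene_expr, pathway_scores, params)
```

With a sigmoid output, each predicted value lies strictly between 0 and 1 but the values are not forced to sum to 1; whether and how the real model constrains its outputs is not stated in the article.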
DeSide outperforms existing methods in predicting cell abundance across 17 tumor types, and shows significantly better performance in specific tumors than current reference-based models. In addition, DeSide can accurately predict cellular compositions for cancer types absent from the training set, demonstrating strong generalization potential.
This study offers robust methods to better understand the tumor microenvironment, enhance prognostic evaluation, and facilitate the development of targeted therapies.